How to handle SIGSEGV, but also generate a core dump

Posted on February 8, 2009, 12:03 am, by Alexander Sandler, under Blog, Howto, Short articles.

Recently I ran into this problem. How do you capture SIGSEGV with a signal handler and still generate a core file?

The problem is that once you install your own signal handler for SIGSEGV, Linux no longer performs the default action for the signal, and it is the default action that generates the core file. So once you catch SIGSEGV, consider all that useful information about the origin of the exception lost.

Luckily, there’s a solution. Here’s what I did.

You start by registering a signal handler. Once you get the signal, inside the signal handler, reset the handler for that signal to SIG_DFL. Then send yourself the same signal using the kill() system call. Here’s a short code snippet that demonstrates this little trick in action.

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <signal.h>

void sighandler(int signum)
{
    printf("Process %d got signal %d\n", getpid(), signum);
    signal(signum, SIG_DFL);
    kill(getpid(), signum);
}

int main()
{
    signal(SIGSEGV, sighandler);
    printf("Process %d waits for someone to send it SIGSEGV\n",
        getpid());
    sleep(1000);

    return 0;
}

Note that this code doesn’t actually cause a segmentation fault. To simulate one, I ran kill -11 <pid> from the command line. This is what happened.

$ ls
sigs.c
$ gcc sigs.c
$ ./a.out
Process 2149 waits for someone to send it SIGSEGV
Process 2149 got signal 11
Segmentation fault (core dumped)
$ ls
a.out*  core  sigs.c

Obviously, without lines 9 and 10 of the code (the signal() and kill() calls in the handler), there would be no core file.

By the way, you can use this technique to handle any core generating exception – SIGILL, SIGFPE, etc.
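
As a rough sketch of how that could look for several signals at once, here is one possible variation. The names crash_handler and install_crash_handlers are made up for this example, and the choice of signals is only an assumption. It uses sigaction() to install the handler, write() instead of printf() (printf() is not on the list of async-signal-safe functions), and raise() instead of kill().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler name, not taken from the article. */
static void crash_handler(int signum)
{
    /* write() is async-signal-safe, printf() is not. */
    static const char msg[] = "caught a fatal signal, re-raising it\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;

    /* Same two steps as in the article: restore the default action,
     * then send the same signal again. */
    signal(signum, SIG_DFL);
    raise(signum);
}

/* Install crash_handler for a few core-generating signals.
 * Returns 0 on success and -1 if any sigaction() call fails. */
int install_crash_handlers(void)
{
    static const int signals[] = { SIGSEGV, SIGILL, SIGFPE, SIGBUS, SIGABRT };
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);

    for (size_t i = 0; i < sizeof signals / sizeof signals[0]; i++) {
        if (sigaction(signals[i], &sa, NULL) != 0)
            return -1;
    }
    return 0;
}
```

Calling install_crash_handlers() once at the start of main() should be enough. Whether the resulting core points at the original fault can still depend on the platform and on threading.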


34 Comments

Signal handling in Linux - Alex on Linux says:

April 19, 2009 at 8:17 pm

[…] case you still want to handle exception signals, read my How to handle SIGSEGV, but also generate a core dump […]

dam says:

October 12, 2009 at 2:03 pm

Thanks Alexander. Nice trick. This is exactly what I was looking for.

Alexander Sandler says:

October 14, 2009 at 9:36 am

@dam
Thanks! Please visit again

Gal says:

November 27, 2009 at 10:46 am

I used this code in a shell I wrote, in order to use ctrl-c and ctrl-v on the processes the shell runs.
In some scenarios it caused a segmentation fault.
For example: a process was suspended (on the suspended queue), then I moved it to the foreground (and removed it from the queue), and then the handler caused a segfault. Do you know why? It also happens when I don’t remove it from the queue but only move it from suspended to foreground.
Thanks

Eli says:

January 7, 2010 at 3:46 pm

Did you try this way?

http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_21.html#SEC353

pradeep says:

September 20, 2010 at 8:53 am

Hi Alexander,
I am new to Linux and trying to understand signals. Your articles are very interesting. I have tried the above program, but didn’t see the core dump file. This is what I have tried:

pradeepk@ipglx29> ./a.out
Process 18123 waits for someone to send it SIGSEGV
Process 18123 got signal 11
Segmentation fault

ls -l
total 11
-rwxrwxr-x 1 pradeepk users 9265 2010-09-20 12:15 a.out*
-rw-rw-r-- 1 pradeepk users 396 2010-09-20 12:15 sig.c
~/IPC [ NONE ]

From another terminal I have tried kill -11 pid

Thanks for the help.

Regards
Pradeep

latanius says:

September 20, 2010 at 11:58 am

Nice, works, thanks Alexander! (btw, without this, catching a (real, null-pointer-dereference) SIGSEGV results in calling the signal handler again and again, presumably because the control returns to the faulting address after the handler runs… so this thing seems to be quite useful.)

@pradeep:
core dumps aren’t created by default, you have to explicitly enable them, see for example this article:
http://aplawrence.com/Linux/limit_core_files.html

Alexander Sandler says:

September 26, 2010 at 11:17 pm

@pradeep and @latanius, it seems you guys have found each other.
Oh btw, I often use “ulimit -c unlimited” – that’s because I usually have no idea what will be the size of the core file.
Thanks to both of you for your notes and please come again
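
For completeness, here is a minimal sketch of doing roughly the same thing from inside the program, assuming you would rather not depend on the shell’s ulimit setting. The function name enable_core_dumps is made up for this example. It raises the soft core-size limit to the hard limit, which matches “ulimit -c unlimited” only when the hard limit itself is unlimited.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft RLIMIT_CORE limit to the hard limit.
 * Returns 0 on success and -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    /* An unprivileged process may raise its soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

You would call it early in main(), before anything can crash. Other system settings can still prevent the core file from appearing in the current directory.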

Chris Eleveld says:

October 19, 2010 at 11:55 pm

I see you are re-sending the same signal. Since the original signal is not guaranteed to be one that produces a core dump, have you considered:
gcore(), abort(), or { char *cp = 0; *cp = '1'; }
all of which seem to be fairly short and highly portable. Watch out for the last one if you have SIGBUS or SIGSEGV handlers installed. Also, you may need to use system() or fork/exec to run the gcore utility on Linux, since gcore does not seem to be standard on Linux yet.

Alexander Sandler says:

November 8, 2010 at 4:35 pm

@Chris Eleveld
I didn’t try any of these. What will happen if you open the core file that you generated with *cp = '1'; in gdb? I bet it will point to the line in the signal handler that generated the exception. If you do it the way I suggested, it will pinpoint the place where the exception took place, not the place you resent it from.

Andreas says:

March 17, 2011 at 4:01 pm

Why reconfigure the signal handler? SIG_DFL is just a function pointer. You can also call SIG_DFL directly, something like this:

…

void sighandler(int signum)
{
    printf("Process %d got signal %d\n", getpid(), signum);
    SIG_DFL(signum);
}
…

Alexander Sandler says:

April 14, 2011 at 7:59 am

@Andreas
I believe you are right, but I’d still stick to the method I described. This is because type of SIG_DFL is implementation dependent. No-one promises that it is a pointer to a valid function.

10110111 says:

April 22, 2011 at 11:46 am

You can call abort() from the signal handler; it will also generate a core dump, so there is no need for the magic you described.

Bart says:

October 6, 2011 at 1:03 pm

Originally posted by 10110111: You can call abort() from signal handler, it will also generate core dump, so no need in magic you described.

Using abort() in your signal handler will raise signal SIGABRT.
Yes, you get a core file…… of your signal handler, not the process that actually crashed.

id says:

November 28, 2011 at 12:18 am

How did you send it signal 11? I ran it the same way but never got signal 11.

Gabe Black says:

February 2, 2012 at 10:41 pm

@Andreas –

Yes, SIG_DFL’s type is a function pointer; however, if you look at its value, it is set to 0. So obviously you couldn’t call SIG_DFL(signum). A signal is actually captured first by the kernel, which can forward it to a custom signal handler if the signal handler is non-zero. Alexander is right to consider SIG_DFL implementation dependent and not to assume it is a pointer to a valid function (in the current implementation it does not point to a valid function).

Alexander Sandler says:

February 5, 2012 at 10:33 pm

@id
Doesn’t kill -11 from the command line work?

Mansour says:

April 14, 2012 at 11:21 pm

If you setup a custom handler using sigaction(), and set sa_flags = SA_RESTART, then after reverting the handler to SIG_DFL, you won’t need to send yourself SIGSEGV as the second attempt to execute the offending instructions will cause a normal segfault.

Not sure how this is more useful, just a thought.

popen: safe to call in signal handler? says:

May 3, 2012 at 3:20 pm

[…] creation of a core dump, or signal something that a core dump is about to occur using this trick: How to handle SIGSEGV, but also generate a core dump – Alex on Linux But don't call printf like in the example since that's not in the list. […]

RickS says:

May 11, 2012 at 9:48 pm

Hi Alexander:

I am trying to debug a program that uses the exact same method of re-triggering a SIGSEGV fault in the signal handler as you describe. However, when trying to analyze the resulting core dump, I cannot seem to get a useful backtrace to where the offending instruction occurred:

(gdb) bt
#0 0xb7b7d3b1 in kill () from /lib/libc.so.6
#1 0xb7f8dd2d in CEventHandler::defaultMachineSignalHandler (signo=11) at ../source/Event.cpp:369
#2 0xb7fc0420 in ?? ()
#3 0x0000000b in ?? ()
#4 0x00000033 in ?? ()
#5 0x00000000 in ?? ()

The stack contents between the initial SIGSEGV and the handler don’t allow me to see where the original fault occurred. The program is embedded, multi-threaded, uses shared objects and is cross-compiled from another system. The system it is running on is:
uname -a
Linux D400 2.6.18.2-ASAT #1 PREEMPT Wed Jul 21 11:40:47 MDT 2010 i686 i686 i386 GNU/Linux

Do you know of a way to unwind the stack before the core dump is generated, so that I can get info on the original fault?

Any help you could provide would be appreciated.

Alex says:

September 4, 2012 at 12:25 pm

Hi Alex,

I confirm the trouble reported by RickS.
Your solution shows a valid call stack for a core generated on Ubuntu, but not on Red Hat Linux. In that case the backtrace shows something unrelated to the real place that caused the signal to be sent.

So this is probably not a universal method.

engineer says:

August 14, 2013 at 11:49 am

Hi alexander,
I double confirm the trouble reported by RickS and Alex.

When I run a multi-threaded program, the core file seems to point to a place completely unrelated to where the segfault occurred. Is there any workaround for this?

Say there are 3 threads: m (main), t1, t2.
If the SEGV happened in t2, I see the core file pointing to some unrelated code in m. I think by the time you catch the signal, do cleanup and generate the core file, the stack has changed.

Alexander Sandler says:

September 6, 2013 at 12:52 am

@engineer
@Alex
@RickS
I think there is a workaround. Instead of sending kill to getpid(), send it to gettid(). See here:
http://www.alexonlinux.com/how-to-obtain-unique-thread-identifier-on-linux

In Linux, when you send a signal to a process, it is handled by an arbitrary thread. So when you kill getpid(), any thread in the process may handle it. So, instead of sending kill to getpid(), send it to the sick thread and you will get the right backtrace.
I’ll try to confirm that it works some time later.

- moe says:
  
  June 15, 2015 at 11:31 pm
  
  You can also use raise(signum) – which is guaranteed to send the signal to the calling thread.
  
Libin says:

March 13, 2014 at 1:36 pm

nice one.. was searching for something like this..

thomsy says:

May 6, 2014 at 11:09 pm

Why can’t we use SA_RESETHAND?
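
A minimal sketch of what that could look like, assuming a genuine fault (not one simulated with kill) on Linux. The names log_and_return and run_faulting_child are made up for this example. With SA_RESETHAND the disposition goes back to SIG_DFL as soon as the handler is entered, so the handler can simply return; the faulting instruction then runs again and this time gets the default action. The demo forks a child so that the crash can be observed from the parent.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handler: reports the signal and simply returns. */
static void log_and_return(int signum)
{
    static const char msg[] = "got a fatal signal, returning to the faulting code\n";
    ssize_t ignored = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)ignored;
    (void)signum;
}

/* Fork a child that installs the handler with SA_RESETHAND and then writes
 * through a null pointer. Returns the number of the signal that terminated
 * the child, or -1 on any other outcome. */
int run_faulting_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = log_and_return;
        sa.sa_flags = SA_RESETHAND;   /* back to SIG_DFL on handler entry */
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGSEGV, &sa, NULL) != 0)
            _exit(1);

        volatile int *p = NULL;
        *p = 10;    /* faults, handler runs once, instruction is retried */
        _exit(0);   /* not expected to be reached */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

Note that this only covers real faults. If the signal was sent with kill -11, returning from the handler just resumes the program, so re-sending the signal is still needed in that case.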

Venu says:

June 3, 2014 at 7:40 am

Why are we re-raising the signal after calling the default handler? The default handler is supposed to generate the core and terminate the application, right?

Here:
http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_21.html#SEC353

There is no invocation of the default handler there, hence the re-raise. But here I couldn’t see the reason.

Himanshu says:

June 16, 2015 at 1:03 pm

hey, thanks Alexander!
it was helpful!!

Mohit says:

September 3, 2015 at 7:36 pm

I am able to handle the SEGV signal, but after that the process goes to sleep,

and if we send the request again, the handler is still not invoked.

Is there any solution in which the specific thread (the one that caused the SEGV) can be terminated while all other threads keep running?

Thanks..!

Dip says:

August 2, 2016 at 5:09 pm

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <signal.h>

void sighandler(int signum)
{
    printf("Process %d got signal %d\n", getpid(), signum);
    signal(signum, SIG_DFL);
    kill(getpid(), signum);
}

int main()
{
    signal(SIGSEGV, sighandler);
    printf("Process %d waits for someone to send it SIGSEGV\n",
        getpid());
    int *p = NULL;
    printf("++creating seg fault\n");
    *p = 10;

    return 0;
}

Why doesn’t this one give the proper backtrace? Any help is appreciated.

- Vince says:
  
  November 22, 2016 at 12:43 am
  
  What worked for me was to reset the handler with signal(signum, SIG_DFL);
  and let the handler return.
  
  Vince
  